Water Observations from Space (WOFS)

This notebook demonstrates the Australian Water Observations from Space (WOFS) algorithm, applied here to Sentinel-2 data instead of Landsat. This water-detection algorithm is an improvement over the Landsat QA water flag and the NDWI index for identifying water. For more information, visit:

http://www.ga.gov.au/scientific-topics/hazards/flood/wofs
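As a point of comparison for the NDWI index mentioned above, NDWI is computed from green and NIR reflectance: water reflects little NIR, so water pixels come out positive. A minimal sketch with NumPy (the band values below are illustrative, not from this dataset):

```python
import numpy as np

def ndwi(green, nir):
    """Normalized Difference Water Index: (green - nir) / (green + nir).
    Positive values typically indicate water; vegetation is strongly negative."""
    green = np.asarray(green, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    return (green - nir) / (green + nir)

# Illustrative surface-reflectance values (0-10000 scaling)
water_pixel = ndwi(green=1200, nir=300)   # water absorbs NIR -> positive NDWI
veg_pixel = ndwi(green=900, nir=4000)     # vegetation reflects NIR -> negative NDWI
print(water_pixel, veg_pixel)
```

WOFS goes beyond this single threshold by feeding several band ratios into a trained decision-tree classifier.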

Import Dependencies and Connect to the Data Cube

In [1]:
import datacube
dc = datacube.Datacube(app='Water_Observations_from_Space')
from datacube.utils import masking

import sys, os
os.environ['USE_PYGEOS'] = '0'

import datetime
import matplotlib.pyplot as plt
import numpy as np  
import xarray as xr
import pandas as pd

from dea_tools.plotting import rgb, display_map
from dea_tools.bandindices import calculate_indices

### EASI tools
sys.path.append(os.path.expanduser('../scripts'))
from ceos_utils.data_cube_utilities.clean_mask import landsat_clean_mask_invalid, landsat_qa_clean_mask
from easi_tools import EasiDefaults
from easi_tools import notebook_utils
easi = EasiDefaults() # Get the default parameters for this system
Successfully found configuration for deployment "eail"
In [2]:
cluster, client = notebook_utils.initialize_dask(use_gateway=False)
display(cluster if cluster else client)
print(notebook_utils.localcluster_dashboard(client, server=easi.hub))

LocalCluster: 4 workers, 8 threads (2 per worker), 29.00 GiB total memory
Dashboard: http://127.0.0.1:8787/status
https://hub.eail.easi-eo.solutions/user/jhodge/proxy/8787/status
In [3]:
from datacube.utils.aws import configure_s3_access
configure_s3_access(aws_unsigned=False, requester_pays=True, client=client)
Out[3]:
<botocore.credentials.Credentials at 0x7fed5b5c00d0>

Choose Platforms and Products

In [4]:
# Define the Product
product = "s2_l2a"

Define the Extents of the Analysis

In [5]:
# Select an analysis region (Latitude-Longitude) 
# Select a time period within the extents of the dataset (Year-Month-Day)

# Mombasa, Kenya
# latitude = (-4.05, -3.95) 
# longitude = (39.60, 39.68) 

# latitude=easi.latitude
# longitude=easi.longitude
# Lake Mead (Overton Arm), Nevada, USA
latitude = (36.3, 36.5)
longitude = (-114.325, -114.43)

# Define Time Range
# Select a time range covered by the Sentinel-2 archive (2015 to current)
time_extents = ('2021-01-01', '2021-12-31')
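As a rough sanity check on the region size before loading, the latitude/longitude extents can be converted to kilometres with a spherical-earth approximation (the constant below is approximate):

```python
import math

latitude = (36.3, 36.5)
longitude = (-114.325, -114.43)

km_per_deg = 111.32  # approximate length of one degree of latitude
height_km = abs(latitude[1] - latitude[0]) * km_per_deg
width_km = abs(longitude[1] - longitude[0]) * km_per_deg * math.cos(math.radians(36.4))

# Roughly 22 km x 9 km; at 10 m resolution this is on the order of
# 2000 x 1000 pixels, consistent with the dataset dimensions loaded below.
print(round(height_km, 1), round(width_km, 1))
```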
In [6]:
# The code below renders a map that can be used to view the analysis region.
display_map(longitude,latitude)
/env/lib/python3.10/site-packages/dea_tools/plotting.py:313: FutureWarning: This function is deprecated. See: https://pyproj4.github.io/pyproj/stable/gotchas.html#upgrading-to-pyproj-2-from-pyproj-1
  all_longitude, all_latitude = transform(Proj(crs), Proj('EPSG:4326'), all_x,
Out[6]:
Make this Notebook Trusted to load map: File -> Trust Notebook

Load and Clean Data from the Data Cube

After loading, view the xarray Dataset below. The dimensions give the number of pixels along the latitude (y) and longitude (x) axes, as well as the number of time slices (time) in the time series.

In [7]:
measurements = ['red', 'green', 'blue', 'nir', 'swir_1', 'swir_2', 'SCL']
data_names = measurements.copy()
data_names.remove('SCL')
In [8]:
s2_dataset = dc.load(latitude = latitude,
                          longitude = longitude,
                          time = time_extents,
                          product = product,
                          output_crs = 'epsg:6933',
                          resolution = (-10,10),
                          measurements = measurements,
                          group_by = 'solar_day',
                          dask_chunks = {'time':1}) 
In [9]:
s2_dataset
Out[9]:
<xarray.Dataset>
Dimensions:      (time: 73, y: 2064, x: 1014)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-05T18:34:07 ... 2021-12-31T18:...
  * y            (y) float64 4.355e+06 4.355e+06 ... 4.334e+06 4.334e+06
  * x            (x) float64 -1.104e+07 -1.104e+07 ... -1.103e+07 -1.103e+07
    spatial_ref  int32 6933
Data variables:
    red          (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    green        (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    blue         (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    nir          (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    swir_1       (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    swir_2       (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    SCL          (time, y, x) uint8 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
Attributes:
    crs:           epsg:6933
    grid_mapping:  spatial_ref
In [10]:
cloud_mask = (s2_dataset.SCL != 0) & (s2_dataset.SCL != 1) & \
             (s2_dataset.SCL != 3) & (s2_dataset.SCL != 8) & \
             (s2_dataset.SCL != 9) & (s2_dataset.SCL != 10)
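The chained comparisons above keep pixels whose Sen2Cor SCL class is not no-data (0), saturated/defective (1), cloud shadow (3), medium/high-probability cloud (8, 9), or thin cirrus (10). An equivalent, more compact formulation uses `numpy.isin`; sketched here on a toy SCL array:

```python
import numpy as np

# SCL classes to exclude: 0 no data, 1 saturated/defective,
# 3 cloud shadow, 8/9 cloud medium/high probability, 10 thin cirrus
BAD_SCL = [0, 1, 3, 8, 9, 10]

scl = np.array([[4, 6, 8],
                [0, 3, 5]], dtype=np.uint8)  # toy scene-classification layer

clear = ~np.isin(scl, BAD_SCL)
# vegetation (4), water (6) and bare soil (5) are kept; the rest are masked
print(clear)
```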
In [11]:
# Rename the SWIR bands to the names the WOFS classifier expects
s2_dataset = s2_dataset.rename_vars({'swir_1': 'swir1', 'swir_2': 'swir2'})
In [12]:
s2_dataset
Out[12]:
<xarray.Dataset>
Dimensions:      (time: 73, y: 2064, x: 1014)
Coordinates:
  * time         (time) datetime64[ns] 2021-01-05T18:34:07 ... 2021-12-31T18:...
  * y            (y) float64 4.355e+06 4.355e+06 ... 4.334e+06 4.334e+06
  * x            (x) float64 -1.104e+07 -1.104e+07 ... -1.103e+07 -1.103e+07
    spatial_ref  int32 6933
Data variables:
    red          (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    green        (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    blue         (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    nir          (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    swir1        (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    swir2        (time, y, x) uint16 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
    SCL          (time, y, x) uint8 dask.array<chunksize=(1, 2064, 1014), meta=np.ndarray>
Attributes:
    crs:           epsg:6933
    grid_mapping:  spatial_ref

Time Series Water Detection Analysis

This section produces the time-series output of the Australian Water Observations from Space (WOFS) algorithm. The result shows the percentage of time each pixel is classified as water over the entire time series: BLUE = frequent water, RED = infrequent water.

In [13]:
s2_dataset = s2_dataset.compute()
/env/lib/python3.10/site-packages/rasterio/warp.py:344: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  _reproject(
In [14]:
# WOFS was written for Landsat Collection 1, so Collection 2 data must be scaled to match Collection 1.
# Adapted from https://github.com/GeoscienceAustralia/wofs/blob/master/wofs/virtualproduct.py
# Since we are using Sentinel-2 data, no actual scaling is applied here; the data is simply put
# into the format the classifier expects.
def scale_usgs_collection2(data):
    """These are taken from the Fractional Cover scaling values"""
    attrs = data.attrs
    data =  data.apply(scale_and_clip_dataarray, keep_attrs=False,
                       scale_factor=1, add_offset=0, # Don't actually scale or offset the data
                       clip_range=None, valid_range=(0, 10000))
    data.attrs = attrs
    return data

def scale_and_clip_dataarray(dataarray: xr.DataArray, *, scale_factor=1, add_offset=0, clip_range=None,
                             valid_range=None, new_nodata=-999, new_dtype='int16'):
    orig_attrs = dataarray.attrs
    nodata = dataarray.attrs['nodata']

    mask = dataarray.data == nodata

    # Values outside valid_range (e.g. > 10000) are also set to nodata below
    dataarray = dataarray * scale_factor + add_offset

    if clip_range is not None:
        dataarray = dataarray.clip(*clip_range)

    dataarray = dataarray.astype(new_dtype)

    dataarray.data[mask] = new_nodata
    if valid_range is not None:
        valid_min, valid_max = valid_range
        dataarray = dataarray.where(dataarray >= valid_min, new_nodata)
        dataarray = dataarray.where(dataarray <= valid_max, new_nodata)
    dataarray.attrs = orig_attrs
    dataarray.attrs['nodata'] = new_nodata

    return dataarray

s2_dataset_scaled = scale_usgs_collection2(s2_dataset) # Use the same function for converting landsat data from collection 2 to collection 1 values, but use scale and offset of 1 and 0.
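The nodata and valid-range handling in `scale_and_clip_dataarray` can be checked on a toy array. This sketch inlines the same logic with plain NumPy (the values are illustrative): original nodata pixels are mapped to the new nodata value, and anything outside the valid range is also flagged as nodata.

```python
import numpy as np

nodata, new_nodata = 0, -999
valid_min, valid_max = 0, 10000

data = np.array([0, 500, 10000, 12000], dtype=np.uint16)

out = data.astype('int16')
out[data == nodata] = new_nodata                      # propagate original nodata
invalid = (out != new_nodata) & ((out < valid_min) | (out > valid_max))
out[invalid] = new_nodata                             # out-of-range -> nodata
print(out)   # [ -999   500 10000  -999]
```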
In [15]:
from ceos_utils.data_cube_utilities.dc_water_classifier import wofs_classify
ts_water_classification = wofs_classify(s2_dataset_scaled, clean_mask=cloud_mask.values, no_data=0, x_coord='x', y_coord='y')
/home/jovyan/cal-notebooks/examples/../scripts/ceos_utils/data_cube_utilities/dc_water_classifier.py:136: RuntimeWarning: divide by zero encountered in divide
  return (a - b) / (a + b)
In [16]:
# Replace no_data values (-9999) with NaN
ts_water_classification = ts_water_classification.where(ts_water_classification != -9999).astype(np.float16)

# Time series aggregation that ignores nan values.    
water_classification_percentages = (ts_water_classification.mean(dim = ['time']) * 100).wofs.rename('water_classification_percentages')
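The per-pixel percentage is simply a NaN-aware mean over time, scaled by 100: cloudy (NaN) observations drop out of both the numerator and the denominator. A toy check with NumPy (values illustrative):

```python
import numpy as np

# 4 time steps for one pixel: 1 = water, 0 = not water, NaN = cloudy/no data
pixel_series = np.array([1.0, 0.0, 1.0, np.nan])

# nanmean ignores the cloudy observation: 2 water / 3 clear observations
pct_water = np.nanmean(pixel_series) * 100
print(round(pct_water, 1))   # 66.7
```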
In [17]:
# import color-scheme and set nans (no data) to black
from matplotlib.cm import jet_r
jet_r.set_bad('black',1)
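Note that calling `set_bad` on the imported `jet_r` object mutates the globally registered colormap, which recent Matplotlib versions warn about and newer ones forbid. A sketch of the safer copy-first pattern (uses the `matplotlib.colormaps` registry, available in Matplotlib 3.5+):

```python
import matplotlib

# Copy the registered colormap before customising it, leaving the original intact
cmap = matplotlib.colormaps['jet_r'].copy()
cmap.set_bad('black', 1.0)   # NaN (no data) pixels render as opaque black

# Calling a colormap with NaN returns the "bad" RGBA colour
print(cmap(float('nan')))    # (0.0, 0.0, 0.0, 1.0)
```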
In [18]:
img_scale = water_classification_percentages.shape[0]/water_classification_percentages.shape[1]
In [19]:
# This is where the WOFS time-series product is generated.
# Areas in RED experienced little or no water over the time series.
# Areas in BLUE experienced significant or constant water over the time series.

figsize=6
water_classification_percentages.plot(cmap = jet_r, figsize=(figsize,figsize*img_scale))
plt.title("Percent of Samples Classified as Water")
plt.axis('off')
plt.show()